-
- DDDDDD FFFFFFF BBBBB
- DDDDDDD FFFFFFF BBBBBBB
- DD DD FF BB BB
- DD DD FFFFF BBBBB
- DD DD FFFFF BB BB
- DDDDDDD FF BBBBBBB
- DDDDDD FF BBBBB
-
- Data Flow Benchmark V 1.6
-
- (c) 1994 by D.Engert
-
-
- 1. Legal stuff
-
- There is no warranty. Use this software at your own risk. Due to the
- complexity and variety of today's hardware and software which may be
- used to run this program, I am not responsible for any damage or loss of
- data caused by use of this software. It has been tested thoroughly and is
- expected to work correctly, but nobody can actually guarantee this under
- all circumstances. And because this software is free, you get what you
- pay for ...
-
- This program can be used freely for private or educational purposes. If
- you want to use it for commercial purposes, find any bugs, or have
- suggestions for further enhancements, please contact the author.
-
- Author: Detlef D. Engert
- Gruentenweg 14
- D-90471 Nuernberg
- Germany
-
- Fax: +49-911-861319
- Mailer: +49-911-861319 UTC 18:00 - 23:00
- EMail: 2:2490/1145.9@fidonet
- 2:2490/2576@fidonet
- engert@ibm.net
-
- 2. Purpose and intent of this program
-
- Today's hardware is getting more and more powerful, but more complicated too.
- Modern motherboards using up-to-date chip sets may turn out to be very
- difficult to configure. To make things worse, there are now manufacturers of
- CPU chips besides Intel, with new features and options. The memory subsystems
- implemented on these motherboards are even harder to configure, taking into
- consideration different cache strategies, RAM speeds and access modes.
-
- Beyond the core of any computer system lie the peripherals (video, magnetic
- storage, ...) connected by a variety of bus implementations like ISA, EISA,
- VLB or PCI. Chip sets used on these peripherals are often even more complex
- than the computer core.
-
- Even skilled users are often overwhelmed by the sheer complexity and variety
- of options offered. Nobody will tell them the real power available to them
- from a given computer system using a particular configuration set. How should
- a user optimize his or her computer, or how should a buyer choose between
- similar looking components based on hard facts ? Maybe this program will help
- you !
-
-
- 3. What will this program offer ?
-
- Let's have a look at the output (framed) of the current version, run on my own
- machine.
-
- Machine configuration:
-
- GigaByte Motherboard,
- Intel 486DX-50 CPU,
- SiS 411 EISA/VLB chip set,
- EISA/VLB bus system running at 50MHz,
- 256 KByte of 12ns cache RAM, 2 banks of 16 MByte each of 70ns DRAM,
- Diamond Viper Pro Video P9100 video board, 4 MByte VRAM
-
- All configuration settings are optimized for maximum throughput. That gives
- the following results:
-
- +-----------------------------------------------------------------------------+
- | Data flow benchmark v1.6 |
- | |
- | copyright (c) 1994 D.Engert; partially based on a DOS implementation by |
- | A.Stiller, c't |
- | |
- | Processor : Intel/AMD 486DX |
- | Clock : 50.0 MHz |
- | Coprocessor : present |
- | Internal bus width : 32 bit between processor and primary cache |
- | External bus width : 32 bit between primary and secondary cache |
- | DRAM page size : 16 KByte, interleaved |
- | MMU cache : 32 entries 4-way set associative, 4KByte per entry |
- | Primary cache : 8 KByte 4-way set associative |
- | Secondary cache : 256 KByte direct mapped |
- | Cache line size : 16 Bytes |
- | Cache strategy : write through |
- +-----------------------------------------------------------------------------+
-
- These figures are quite self-explanatory. The program detects the type and
- speed of the CPU, then the width of the data paths between the CPU core and
- the primary cache (typically located on the same chip as the CPU core) and
- between the core or primary cache and the secondary cache (if present) or main
- memory.
- The program next tries to determine the effective page size, if page mode is
- implemented by the chip set. The following 3 lines show information about the
- address translation lookaside buffer (MMU cache) and the primary and secondary
- caches. Size and associativity are checked, the length of a cache line is
- determined, and the strategy used by the cache subsystem (write-through or
- write-back) is sensed.
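
The size, associativity and line size reported above fix the geometry of each cache. The following Python sketch is only an illustration of that standard arithmetic, applied to the sample machine; it is not taken from DFB, and the helper name cache_sets is made up for this example.

```python
def cache_sets(size_bytes, ways, line_bytes):
    """Number of sets = cache size / (associativity * line size)."""
    return size_bytes // (ways * line_bytes)

KBYTE = 1024

# Primary cache: 8 KByte, 4-way set associative, 16-byte lines.
primary_sets = cache_sets(8 * KBYTE, ways=4, line_bytes=16)

# Secondary cache: 256 KByte, direct mapped (1 way), 16-byte lines.
# For a direct-mapped cache the number of sets equals the number of lines.
secondary_lines = cache_sets(256 * KBYTE, ways=1, line_bytes=16)

# MMU cache (TLB): 32 entries, 4-way set associative, 4 KByte per entry.
tlb_sets = 32 // 4
tlb_reach_bytes = 32 * 4 * KBYTE  # address range covered without a TLB miss

print(primary_sets, secondary_lines, tlb_sets, tlb_reach_bytes // KBYTE)
```

For the sample machine this gives 128 sets in the primary cache, 16384 lines in the secondary cache, and a TLB with 8 sets that covers 128 KByte of address space.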
-
- +-----------------------------------------------------------------------------+
- | Data flow and bus performance memory |
- | |
- | -- memory -> CPU --------- |
- | Maximum 4K FETCH (Hits) : 19.5µs ( 973c) => 200.7MB/s (0.25c/Byte) |
- | 4K FETCH (Miss+Hit) : 29.2µs ( 1461c) => 133.6MB/s (0.37c/Byte) |
- | Minimum 4K FETCH (Misses) : 59.7µs ( 2985c) => 65.4MB/s (0.76c/Byte) |
- | Maximum 4K LODSD (hits) : 83.2µs ( 4160c) => 49.3MB/s (1.02c/Byte) |
- | 4K LODSD (miss+hit) : 97.6µs ( 4882c) => 42.0MB/s (1.19c/Byte) |
- | Minimum 4K LODSD (misses) : 134.5µs ( 6730c) => 30.4MB/s (1.64c/Byte) |
- | -- CPU -> memory --------- |
- | Maximum 4K STOSD (hits) : 82.1µs ( 4106c) => 49.9MB/s (1.00c/Byte) |
- | Minimum 4K STOSD (misses) : 82.1µs ( 4108c) => 49.9MB/s (1.00c/Byte) |
- | -- memory -> memory ------ |
- | Maximum 4K MOVSD (hits) : 65.5µs ( 3275c) => 62.6MB/s (0.80c/Byte) |
- | 4K MOVSD (miss+hit) : 129.9µs ( 6498c) => 31.5MB/s (1.59c/Byte) |
- | 4K MOVSD (clean) : 202.7µs ( 10142c) => 20.2MB/s (2.48c/Byte) |
- | 4K MOVSD (dirty) : 141.3µs ( 7071c) => 29.0MB/s (1.73c/Byte) |
- | Minimum 4K MOVSD (misses) : 203.2µs ( 10166c) => 20.2MB/s (2.48c/Byte) |
- +-----------------------------------------------------------------------------+
-
- These are the performance figures of the CPU <--> memory data path.
-
- There are four disciplines:
- - opcode fetch
- - data load
- - data store
- - data move.
-
- Depending on the discipline, several scenarios are tested (denoted in
- parentheses):
- - hits in all memory caches (hits)
- - hit in secondary cache, but not in primary (miss+hit)
- - hit with replace in clean secondary cache, no write back necessary (clean)
- - hit with replace in dirty secondary cache, write back carried out (dirty)
- - misses in all caches (misses)
-
- The scenarios are listed from fastest to slowest: the first (hits) should give
- maximum performance, the last (misses) minimum speed.
-
- The test transfer size depends on the cache and page sizes. In the case above
- it is 4KByte.
-
- There are four result columns:
- - absolute time needed for one test of the mentioned size
- - the same in clock cycles
- - the resulting transfer speed in MBytes per second
- - the cost of the operation in cycles per Byte
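
The last three columns can be derived from the cycle count. The Python sketch below shows the relationship under two assumptions that are inferred from the sample output and not stated by the program: that 1 MB is counted as 10^6 bytes, and that the time column is simply cycles divided by the clock frequency. The function name derive_columns is hypothetical.

```python
def derive_columns(cycles, clock_mhz, size_bytes):
    """Return (time in microseconds, MB/s, cycles per byte) for one test.

    Assumes 1 MB = 10**6 bytes (inferred from the sample figures).
    """
    time_us = cycles / clock_mhz           # MHz = cycles per microsecond
    mb_per_s = size_bytes / time_us        # bytes per microsecond = 10**6 bytes/s
    cycles_per_byte = cycles / size_bytes
    return time_us, mb_per_s, cycles_per_byte

# Sample row: "Maximum 4K STOSD (hits)", 4106 cycles at 50.0 MHz, 4 KByte.
time_us, mb_per_s, cpb = derive_columns(4106, 50.0, 4096)
print(f"{time_us:.1f}us  {mb_per_s:.1f}MB/s  {cpb:.2f}c/Byte")
```

For that row the sketch reproduces the printed values (82.1µs, 49.9MB/s, 1.00c/Byte). DFB measures the time itself, so other rows may differ from this calculation in the last digit.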
-
- +-----------------------------------------------------------------------------+
- | VIO info : XGA, 0 KByte video memory |
- | Device info : manufacturer Weitek, chip set W5186, 1 MByte video memory |
- | Screen : 1280x1024x256 |
- | Aperture : 4 MByte @ 0xC0800000 |
- | Bus width : 32 bit between CPU and video memory |
- +-----------------------------------------------------------------------------+
-
- The information about the video system is queried from different parts of OS/2,
- so there may be different figures for the same item. That depends more or less
- on how carefully the developer of the video drivers did the job...
-
- +-----------------------------------------------------------------------------+
- | Data flow and bus performance video |
- | |
- | -- Video -> CPU ---------- |
- | Maximum 4K LODSD (Hits) : 390.5µs ( 19536c) => 10.5MB/s (4.77c/Byte) |
- | Minimum 4K LODSD (Misses) : 390.4µs ( 19528c) => 10.5MB/s (4.77c/Byte) |
- | -- CPU -> Video ---------- |
- | 4K STOSD : 82.3µs ( 4116c) => 49.8MB/s (1.00c/Byte) |
- | -- Memory -> Video ------- |
- | Maximum 4K MOVSD (Hits) : 86.0µs ( 4301c) => 47.6MB/s (1.05c/Byte) |
- | 4K MOVSD (Miss+Hit) : 133.4µs ( 6674c) => 30.7MB/s (1.63c/Byte) |
- | 4K MOVSD (Clean) : 536.0µs ( 26812c) => 7.6MB/s (6.55c/Byte) |
- | Minimum 4K MOVSD (Misses) : 539.5µs ( 26986c) => 7.6MB/s (6.59c/Byte) |
- | -- Video -> Memory ------- |
- | Maximum 4K MOVSD (Hits) : 536.0µs ( 26812c) => 7.6MB/s (6.55c/Byte) |
- | Minimum 4K MOVSD (Misses) : 539.5µs ( 26986c) => 7.6MB/s (6.59c/Byte) |
- | -- Video -> Video -------- |
- | Maximum 4K MOVSD (Hits) : 562.1µs ( 28118c) => 7.3MB/s (6.86c/Byte) |
- | Minimum 4K MOVSD (Misses) : 678.6µs ( 33946c) => 6.0MB/s (8.29c/Byte) |
- +-----------------------------------------------------------------------------+
-
- This is the same as above. Obviously the opcode fetch discipline is left out,
- but there are more data transfer paths.
-
- The figures for data store and move from primary cache into video memory are
- of little practical meaning on local bus systems and coprocessed video cards,
- but they give at least an idea of how carefully these buses are implemented.
-
- I have not commented on the actual figures, because each system (and most
- probably yours) is different. Compare for yourself; I will only say that this
- system is a fast one in its category...
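
To compare your own figures with the sample above, it can help to turn the result lines into numbers. The following Python sketch parses one line of the framed output as it is printed in this document. It is not part of DFB, the name parse_result_line is made up, and it does not cover the /DMP dump output, whose format is not described here.

```python
import re

# One result line: label : time µs ( cycles c) => MB/s (c/Byte)
RESULT_LINE = re.compile(
    r"(?P<label>[^|:]+?)\s*:\s*"
    r"(?P<us>\d+(?:\.\d+)?)\s*µs\s*"
    r"\(\s*(?P<cycles>\d+)c\)\s*=>\s*"
    r"(?P<mbs>\d+(?:\.\d+)?)\s*MB/s\s*"
    r"\(\s*(?P<cpb>\d+(?:\.\d+)?)c/Byte\)"
)

def parse_result_line(line):
    """Return a dict of the fields in one result line, or None if the
    line is a frame border, a heading or an info line."""
    match = RESULT_LINE.search(line)
    if match is None:
        return None
    return {
        "label": match["label"].strip(),
        "time_us": float(match["us"]),
        "cycles": int(match["cycles"]),
        "mb_per_s": float(match["mbs"]),
        "cycles_per_byte": float(match["cpb"]),
    }

sample = "| Maximum 4K LODSD (hits) : 83.2µs ( 4160c) => 49.3MB/s (1.02c/Byte) |"
print(parse_result_line(sample))
```

Lines that carry no measurement, such as "| Clock : 50.0 MHz |" or the section headings inside the frame, yield None and can simply be skipped.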
-
-
- 4. How do I start this program ?
-
- That's easy: go to a command line and type
-
- DFB [options]
-
- The following options are currently implemented:
-
- /NOV[ideo] : suppress video testing
- /CC:number : set country code to number, default is from CONFIG.SYS
- 49 : Germany (deutsch)
- else : international (english)
- /MORE : stop output after each section to ease reading
- /DMP : dump test values to stderr
- may be redirected to file via 'DFB /DMP [...] 2>filename'
-
- Options are not case sensitive !
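
The bracket notation in /NOV[ideo] is read here as "the part in brackets may be omitted or shortened". The Python sketch below illustrates that reading together with case-insensitive matching. It is an assumption about the notation and not DFB's actual parser; match_option is a hypothetical helper.

```python
def match_option(arg, required, optional=""):
    """True if arg is '/' + required + any leading part of optional,
    compared without regard to case."""
    if not arg.startswith("/"):
        return False
    name = arg[1:].upper()
    full = (required + optional).upper()
    return len(name) >= len(required) and full.startswith(name)

# All of these would select the "suppress video testing" option:
for arg in ("/NOV", "/nov", "/NoVid", "/novideo"):
    print(arg, match_option(arg, "NOV", "IDEO"))
```

Under this reading "/NO" is too short to be accepted and "/NOVIDEOS" is too long.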
-
- The commands 'help DFB0000' or 'DFB /?' will give you the same information about
- the usage of the latest version of DFB.
-
- If you start DFB in a full screen session, video testing will be left out,
- since there is no video aperture available. Use a windowed OS/2 session
- instead.
-
- To access memory and video for testing, the device driver SSMDD.SYS must be
- loaded. It is part of MMPM/2 and is also included in the distribution of DFB.
- Make sure the statement
-
- DEVICE=[path]\SSMDD.SYS
-
- is in your config.sys file. DFB will tell you if it is not. In this case, only
- a small part of the DFB functionality is available (basic CPU type checking).
-
- If you create a minimal boot disk with floppy support only, there is enough
- room left to put SSMDD.SYS and DFB.EXE onto it too. So you may enter your
- favourite computer shop and check out the different machines offered simply
- by booting from this floppy disk.
-
-
- 5. Is it dangerous to use this program ?
-
- Yes, there is !
-
- First, I am human, so I am error-prone :-)
-
- Second, DFB goes down to the bare metal of your computer. Therefore I don't
- guarantee that it will interact with any running program or any active device
- in a totally harmless manner. Before you start DFB, I recommend stopping any
- other running user process and waiting until all sensitive devices are idle.
- That is not a must, but it reduces the risk (I run DFB in parallel with my
- communications software, active CD-ROM and active audio system).
-
-
- 6. A wish of the author
-
- Since I have access only to Intel 486 class machines, I would appreciate it if
- you ran the tests using the dump option and dropped me an email with a
- description of your system and the resulting dump output. If you can't reach me
- through Fidonet, you may send your message to Compuserve 100275,3253. I am
- interested in any non-i486 CPU.
- Everybody who provides me with new information may consider him/herself a
- registered user of a forthcoming shareware version (if there will be one...)
-
- ---
-